Probabilities and Frequency Distributions
POLS 3312: Argument, Data, and Politics
2024-01-31
Announcements
- Grades in Canvas
- Paper will be posted in Canvas for Monday. You do not need to read it in advance. We will go through it Monday as an example of how to read an academic paper. You DO NEED to have a copy, either electronic or printed, with you on Monday.
Overview
- Basic overview of probability
- Frequency Distributions
Probability
- Probability is a measure of the likelihood of an event
- Probability is a number between 0 and 1 (or 0 and 100%)
- 0 means the event is impossible
- 1 means the event is certain
- 0.5 means the event is as likely as not
Finding probability
- Probability can be estimated as the ratio of the number of times an event occurs to the total number of trials
- For example, if we flip a coin 10 times and get 5 heads, our estimate of the probability of heads is 5/10, or 0.5
- If we flip a coin 100 times and get 50 heads, the estimate is 50/100, or 0.5
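This relative-frequency idea can be checked by simulation. The sketch below uses Python's standard library rather than the R used for the slides' graphs; the function name `relative_frequency` is ours, not from the course materials.

```python
import random

random.seed(42)  # fix the seed so the flips are reproducible

def relative_frequency(trials):
    """Estimate P(heads) as (number of heads) / (number of flips)."""
    heads = sum(random.random() < 0.5 for _ in range(trials))
    return heads / trials

for n in (10, 100, 10_000):
    print(n, relative_frequency(n))
```

With more flips, the estimate settles near 0.5, previewing the law of large numbers discussed later.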
Why do we use data?
- Purpose: analyzing data for causal inference (to begin to make statements about cause and effect - inferring causes)
- Complex and uncertain data requires that we make…
Assumptions about the data
- Because the world is complex, to make sense of unknowns we make assumptions about data
- The assumptions are useful approximations even when not precisely true
- We still need to check that the real data does not seriously violate the assumptions
Data Assumptions: Random, Independent, and Identically Distributed
- Randomness and independence matter as assumptions about data
- Specifically, these are assumptions about the Data Generating Process or DGP
- The Data Generating Process: the way the world produces the data
The Data Generating Process
- The source of the data matters - the DGP matters
- Experiment vs observation are one way DGP matters
- Previously stated: Data comes from a random world
- So the DGP has a random element
Independence and Distribution
- Events in the data are independent and identically distributed - the IID assumption
Independence and Distribution
- Events in the data are independent and identically distributed - the IID assumption
- Independence is statistical independence - the outcome of one event does not affect our belief about the probability of another event
- We can draw a random number from a hat, then flip a coin. The hat draw does not affect the probability of the coin toss
Independence and causation
- Falsifiability assumption: X does not affect Y
- If X does appear to affect Y, we may begin to infer some direct or indirect causal relationship, possibly running through one or more additional variables, but not necessarily that X causes Y. This is commonly shortened to the not-quite-accurate summary “correlation does not imply causation.”
Independence and Distribution
- Events in the data are independent and identically distributed - the IID assumption
- Independence is statistical independence - the outcome of one event does not affect our belief about the probability of another event
- Identically distributed: drawn from the same probability distribution
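The hat-and-coin example can be checked by simulation: for independent events, the joint probability should be close to the product of the individual probabilities. A minimal sketch in Python (the variable names are ours):

```python
import random

random.seed(1)

N = 100_000
odd = heads = both = 0
for _ in range(N):
    hat = random.randint(1, 10)    # draw a number from the hat
    coin = random.random() < 0.5   # flip a coin, independently of the draw
    odd += hat % 2 == 1
    heads += coin
    both += (hat % 2 == 1) and coin

p_odd, p_heads, p_both = odd / N, heads / N, both / N
# independence: P(odd and heads) should be close to P(odd) * P(heads)
print(p_both, p_odd * p_heads)
```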
So…
Introduction to distributions
- The most important is the normal distribution
- This is because of the central limit theorem
- We will look at these in the most detail: normal, binomial, uniform, and Poisson
Distribution examples
- The following are histograms
- They represent the frequency, or simply the count, of observations for each value
- For example, if the bar at the value 4 reaches 500, it means that 4 came up 500 times in the data
- The graphs were produced by generating random numbers based on the particular distribution with an R function
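A frequency count like the ones behind those histograms can be built in a few lines. This sketch uses Python's standard library rather than R, and a die roll rather than the slides' actual examples:

```python
import random
from collections import Counter

random.seed(0)
# simulate 10,000 rolls of a fair six-sided die
rolls = [random.randint(1, 6) for _ in range(10_000)]

freq = Counter(rolls)          # value -> number of times it came up
for value in sorted(freq):
    print(value, freq[value])
```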
Normal Distribution
- symmetrical around its mean with most values near the central peak
- width is a function of the standard deviation
- Other names: Gaussian distribution, bell curve
Binomial Distribution
- binary
- success/failure
- yes/no
- distribution for a number of Bernoulli trials
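A binomial draw is just the number of successes in a fixed number of Bernoulli trials, so it can be simulated directly. A sketch (Python rather than R; `binomial_draw` is our name):

```python
import random
from collections import Counter

random.seed(0)

def binomial_draw(n, p):
    """Number of successes in n Bernoulli trials with success probability p."""
    return sum(random.random() < p for _ in range(n))

# 10,000 draws from Binomial(n=10, p=0.5)
draws = [binomial_draw(10, 0.5) for _ in range(10_000)]
for k, count in sorted(Counter(draws).items()):
    print(k, count)
```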
Binomial example
- A binomial distribution with n = 1 trial is a Bernoulli distribution
Preview of the Central Limit Theorem
What happens if we do the same thing above but do it 1,000 times and plot the counts?
The Central Limit Theorem
- For sufficiently large sample sizes, the distribution of sample means approximates a normal distribution
- This means with a large enough number of trials, we can apply the normal distribution to know things about measures of central tendency, measures of dispersion, and probabilities
- A common rule of thumb: sample sizes above 30 count as “sufficiently large”
- This is just a preview
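The “do it 1,000 times” idea can be sketched as follows, using a uniform (decidedly non-normal) population to show that the sample means still cluster around the population mean. Python stdlib only; the sample sizes are chosen for illustration:

```python
import random
import statistics

random.seed(0)

def sample_mean(n):
    """Mean of one sample of n draws from Uniform(0, 1)."""
    return statistics.mean(random.random() for _ in range(n))

# 1,000 sample means, each from a sample of size 30
means = [sample_mean(30) for _ in range(1_000)]

# the means cluster near the population mean of 0.5,
# with far less spread than the individual draws
print(statistics.mean(means), statistics.stdev(means))
```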
68-95-99.7 Rule
- One of the rules for normal distributions is the 68-95-99.7 rule:
- 68% of the data is within 1 standard deviation of the mean
- 95% of the data is within 2 standard deviations of the mean
- 99.7% of the data is within 3 standard deviations of the mean
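The rule can be checked empirically by drawing normal random numbers and counting how many fall within 1, 2, and 3 standard deviations (a Python sketch; the course's own graphs were made in R):

```python
import random

random.seed(0)
mu, sigma = 0.0, 1.0
data = [random.gauss(mu, sigma) for _ in range(100_000)]

for k in (1, 2, 3):
    share = sum(abs(x - mu) <= k * sigma for x in data) / len(data)
    print(f"within {k} sd: {share:.3f}")
```

The printed shares should land close to 0.68, 0.95, and 0.997.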
The Law of Large Numbers
- The law of large numbers tells us that if we repeat an experiment a large number of times, the average of the results will be close to the expected value
- This lets us use the sample mean as an estimate of the expected value in the population
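For example, the expected value of one fair six-sided die roll is 3.5; as the number of rolls grows, the observed average closes in on it. A sketch (Python; the function name is ours):

```python
import random

random.seed(0)

def average_roll(trials):
    """Average of `trials` fair six-sided die rolls (expected value: 3.5)."""
    return sum(random.randint(1, 6) for _ in range(trials)) / trials

for n in (100, 10_000, 1_000_000):
    print(n, average_roll(n))
```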
Poisson distribution
- Count of number of events in a fixed time/space
- Known constant mean rate of occurrence
- Each event occurs independently of the time since the last event
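Those three properties suggest a way to simulate a Poisson count without any special library: with a constant rate and independence from the time since the last event, the waiting times between events are exponential, so we can count how many events fit in one time unit. A sketch (Python stdlib; the rate of 4 is arbitrary):

```python
import random
import statistics

random.seed(0)
rate = 4.0  # mean number of events per unit of time

def poisson_draw(rate):
    """Count events in one time unit, with exponential waits between events."""
    t, events = 0.0, 0
    while True:
        t += random.expovariate(rate)  # waiting time until the next event
        if t > 1.0:
            return events
        events += 1

draws = [poisson_draw(rate) for _ in range(10_000)]
# for a Poisson distribution, the mean and the variance both equal the rate
print(statistics.mean(draws), statistics.variance(draws))
```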
Why we can’t use standard OLS regression for other DGP
- We base the judgment that a result is significant on its distance from the mean
- As things get further from the mean in a normal distribution, they become less likely
Authorship and License
![Creative Commons License]()
